Context-Based Contrastive Learning for Scene Text Recognition

Abstract

Pursuing accurate and robust recognizers has been a long-lasting goal for scene text recognition (STR) researchers. Recently, attention-based methods have demonstrated their effectiveness and achieved impressive results on public benchmarks. The attention mechanism enables models to recognize text with severe visual distortions by leveraging contextual information. However, recent studies revealed that the implicit over-reliance on context leads to catastrophic out-of-vocabulary performance. In contrast to their superior accuracy on seen text, models are prone to misrecognize unseen text even with good image quality. We propose a novel framework, Context-based contrastive learning (ConCLR), to alleviate this issue. Our proposed method first generates characters with different contexts via simple image concatenation operations and then optimizes a contrastive loss on their embeddings. By pulling together clusters of identical characters within various contexts and pushing apart clusters of different characters in embedding space, ConCLR suppresses the side-effect of overfitting to specific contexts and learns a more robust representation. Experiments show that ConCLR significantly improves out-of-vocabulary generalization and achieves state-of-the-art performance on public benchmarks together with attention-based recognizers.
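The pull-together / push-apart idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a generic supervised-contrastive-style loss over per-character embeddings, written in plain Python. The function name `char_contrastive_loss`, the `temperature` value, and the toy 2-D embeddings are all assumptions made for illustration, and the image concatenation step that produces characters in new contexts is not modeled here.

```python
import math


def _normalize(v):
    # Scale a vector to unit length (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def char_contrastive_loss(embeddings, labels, temperature=0.1):
    """Hypothetical sketch: average contrastive loss over character embeddings.

    embeddings: list of equal-length float vectors, one per character instance.
    labels: character class of each instance (same length as embeddings).
    Instances with the same label are positives; all others are negatives.
    """
    z = [_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Temperature-scaled cosine similarity to every other instance.
        logits = {j: _dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy data (assumed): two tight groups of 2-D embeddings.
emb = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0], [0.05, 1.0]]
clustered = char_contrastive_loss(emb, ["a", "a", "b", "b"])
mixed = char_contrastive_loss(emb, ["a", "b", "a", "b"])
print(clustered < mixed)
```

In this toy setup the loss is lower when identical characters sit close together in embedding space than when the labels cut across the groups, which is the behavior the abstract attributes to the contrastive objective.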

Similar resources

Sequence to sequence learning for unconstrained scene text recognition

In this work we present a state-of-the-art approach for unconstrained natural scene text recognition. We propose a cascade approach that incorporates a convolutional neural network (CNN) architecture followed by a long short-term memory (LSTM) model. The CNN learns visual features for the characters and uses them with a softmax layer to detect the sequence of characters. While the CNN gives very go...

Named Entity Recognition in Persian Text using Deep Learning

Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

Robust Scene Recognition Using Scene Context Information for Video Contents

We propose a robust scene recognition framework using scene context information for multimedia contents. In multimedia contents, some scene sequences are more likely to happen compared with other scene sequences. We employ a statistical approach to deal with this scene context information. We employ a hidden Markov model (HMM) to model each scene and an n-gram language model to represent the sc...

Accurate Scene Text Recognition Based on Recurrent Neural Network

Scene text recognition is a useful but very challenging task due to the uncontrolled conditions of text in natural scenes. This paper presents a novel approach to recognize text in scene images. In the proposed technique, a word image is first converted into sequential column vectors based on Histogram of Oriented Gradient (HOG). The Recurrent Neural Network (RNN) is then adapted to classify the s...

Learning Context for Text Categorization

This paper describes our work on discovering context for text document categorization. The document categorization approach is derived from a combination of a learning paradigm known as relation extraction and a technique known as context discovery. We demonstrate the effectiveness of our categorization approach using the Reuters-21578 dataset and synthetic real world data from spor...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i3.20245